Europäisches Patentamt
European Patent Office
Office européen des brevets

Publication number: 0 587 437 A2



EUROPEAN PATENT APPLICATION 



Application number: 93307149.0
Date of filing: 10.09.93

Int. Cl.: H03M 7/30, G11B 20/10

Priority: 11.09.92 US 943613

Date of publication of application:
16.03.94 Bulletin 94/11

Designated Contracting States:
DE FR GB

Applicant: International Business Machines
Corporation
Old Orchard Road
Armonk, N.Y. 10504 (US)



Inventor: Kulakowski, John Edward
7541 East Knollwood Place
Tucson, AZ 85715 (US)
Inventor: Means, Rodney Jerome
6988 E. Calle Cerca
Tucson, AZ 85715 (US)

Representative: Moss, Robert Douglas
IBM United Kingdom Limited Intellectual
Property Department Hursley Park
Winchester Hampshire SO21 2JN (GB)



Data compression/decompression and storage of compressed and uncompressed data on a single
data storage volume.

A data file having a plurality of data blocks is divided into one or more transfer units of data
blocks. Before data storage, each transfer unit of data blocks is subjected to its own data
compression cycle to create a group of compressed data blocks. The size of the data transfer
unit, in bytes, is selected to be facile for addressing and retrieving individual recorded
groups of compressed data blocks while providing good channel utilization and compression
efficiency. Also the data-transfer unit size is selected in part based upon data storage
efficiency, i.e. the storage of the compressed data should fill as many addressable data storage
areas as possible. Upon recording each group of compressed data bytes, an entry is made into
a file directory for enabling addressing the recorded compressed data blocks.

[Fig. 1 (flow chart): host processor selects a data transfer unit (DTU) of data blocks for
recording as a group of compressed data blocks; host processor allocates a number of addressable
areas for storing the selected DTU of data blocks; DTU of data blocks is transmitted to data
storage system; data storage system compresses DTU into a group of compressed data blocks and
records the group as one continuum of data; data storage system sends detailed status to host
processor as to the recording of the group of compressed data blocks; host processor associates
all groups of data blocks.]

FIELD OF THE INVENTION

This invention relates to data storage systems that are capable of storing both compressed and
uncompressed data on one data storage volume and to data processing systems utilizing such data
storage systems. This invention also relates to data storage systems that minimize wasted data
storage space on a data storage volume while storing compressed data.

BACKGROUND OF THE INVENTION 

Many data storage media, such as data storage optical disks, have a so-called fixed block
architecture (FBA) format. Such format is characterized in an optical disk by so-called hard
sectoring the disk's single spiral track into a plurality of sectors. Every one of the sectors
has identical data storage capacity, i.e. 512 bytes, 1024 bytes, 4096 bytes, etc. Because of the
FBA disks and the variability of data lengths of compressed data with respect to the source
uncompressed data, in-line data compression has not been employed with FBA formatted disks. It
is desired to efficiently store and enable simple random address accessing of a variable amount
of compressed data resulting from compressing data which has been formatted into addressable
blocks. Such compressed data can then be recorded on a FBA formatted disk. If the sector data
does not compress to fewer bytes, then the data are stored without data compression on the data
storage disk.

It is also desired when a plurality of addressable data blocks is segmented into a plurality of
groups of such data blocks, to maintain host processor addressability of the compressed data
blocks within each compressed group of data blocks. It is also desired when compressing data for
storage on a FBA storage medium to maintain a maximal addressability of all unused data storing
sectors even though the number of sectors required to store the compressed data blocks is
unknown. A further desire is to provide for random addressing of the compressed data blocks
recorded in an FBA formatted storage medium.

The data pattern randomness of most input data streams and the variability in the resulting
length of the compressed data output after the application of the various compression algorithms
do not allow for the prediction of the amount of storage space required to contain the
compressed data. This situation requires a link between the transmission of the data stream to
be compressed and recorded and the results of the compression process to assist the host
processor in its storage management process.

The function of updating a data file in this environment cannot use any usual data updating
process (read, update, write back) because the data pattern as a result of the update may not
compress to the same degree as the original data block and therefore updated compressed data
most probably will not fit in the original storage space required to store the original data.
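
The fit problem can be illustrated with a minimal sketch. It assumes Python's standard `zlib` module as a stand-in compressor (the text names no compressor at this point) and a hypothetical 512-byte sector; the sample data and the name `sectors_needed` are invented for illustration.

```python
import os
import zlib

SECTOR_BYTES = 512  # hypothetical sector capacity


def sectors_needed(data: bytes) -> int:
    """Number of sectors occupied by the zlib-compressed form of data."""
    compressed = zlib.compress(data)
    return -(-len(compressed) // SECTOR_BYTES)  # ceiling division


original = bytes(4096)       # highly repetitive block, compresses well
updated = os.urandom(4096)   # same length after the update, but incompressible

# The updated block has the same uncompressed length yet needs more sectors.
print(sectors_needed(original), sectors_needed(updated))
```

An in-place rewrite would therefore overrun the sectors that held the original compressed data, even though the uncompressed block size never changed.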

In a fixed block architecture (FBA) environment, data are recorded on a data storage medium in
fixed sized units of storage called sectors where each recording track on the medium contains a
fixed number of such sectors. The addressing convention for optical disk devices consists of a
track address on the medium and a sector number of the particular track. On optical media
storage devices, each of the sectors consists of two major parts: an identification field (ID)
used by the device controller to locate a particular sector by a physical address and a data
field for storing data. The informational content of the IDs on hard sectored optical disks is
indelibly recorded, as by a stamping/molding process, on the medium at the time of manufacture.
Other data storage formats also are usable to practice the present invention, such as the known
count-key-data (CKD) and extended count-key-data (ECKD) formats used on many magnetic disk media.

An FBA device attached to a host via the known Small Computer System Interface (SCSI) must
provide the capability to resolve a Logical Block Address (LBA) used by SCSI architected
direct-access data storage devices to address fixed sized units of storage to a unique physical
address (track and sector) on the medium. The SCSI attached FBA device provides to the host a
contiguous address space of N (N is a positive integer) storage locations which can be accessed
for reading or writing in any sequence. Each LBA directory structure (addresses ranging from 0
to N) is the addressing mechanism used to store and retrieve data blocks in the SCSI-FBA
environment (some FBA devices also provide the capability to address the storage space using the
physical address).

As can be seen from the preceding paragraphs, the principal problem facing a designer of a
storage system using data compression techniques in the SCSI-FBA environment is to provide a
mechanism by which fixed size units of data, herein termed data blocks, in an input data stream
can be recorded in a variable amount of storage medium space and still maintain addressability
to the unoccupied storage space and provide for addressability to the recorded data blocks.

Since many optical disks today are of the removable type, it is further desired to enable each
removable data storage medium to be self-describing as to compressed and uncompressed data held
thereon.

DISCUSSION OF THE PRIOR ART

The Vosacek US patent 4,499,539 shows first allocating a number of data storage segments of a
cache or buffer for storing a maximum number of data bytes that are storable in an addressable
track of a direct access storage device (DASD) connected to the cache or buffer. The DASD is a
magnetic disk storage device. The protocol is to stage or transfer one track of DASD data to the
cache or buffer in one input-output operation (one access to the DASD). Upon completion of the
actual data transfer, the cache or buffer is examined. If less than all of the first allocated
segments contain data, then the empty allocated segments are deallocated. Pointers are recorded
in a first one of the allocated segments for pointing to additional allocated segments that
store data from the same DASD track. In this manner the DASD track is emulated in the cache or
buffer.

US Patent No 5097261 (application No USSN 07/441,126) shows a data compaction system for a
magnetic tape peripheral data storage system. Tapes do not have any addressable data storage
areas. The entire tape is formatted each time it is recorded. This formatting feature in
magnetic tapes enables storing variably sized records as variably sized blocks of data. The
storage of uncompressed and compressed data is by addressable blocks of such data. The
application does show including a plurality of records in one block of data recorded on the
tape. Co-pending commonly-assigned US patent application USSN 07/372,744, filed 6/26/89
(Attorney docket TU989003), shows a magnetic tape data storage system that automatically stores
a plurality of small records in each block of recorded data. Each of the records remains
individually addressable. A purpose of combining a plurality of records in one block is to
reduce the number of inter-block gaps for increasing the storage capacity of the magnetic tape.

Data compression and decompression algorithms and systems are well known. US patent 5109226
shows an in-line (real time) data compression/decompression system for use in high speed data
channels. This system uses an algorithm shown in the Langdon, Jr. et al US patent 4,467,317.
Batch processed (software) data compression and decompression is also well known. PKWARE, Inc.,
7032 Ardara Avenue, Glendale WI 53209 USA provides the software programs PKZIP for batch
compression, PKUNZIP for batch decompression among other compression-decompression software.
Another data compression-decompression algorithm has been used for both batch (software
processing) and in-line (hardware-integrated semiconductor chips) processing. The known
Lempel-Ziv-1 data compression/decompression algorithm is used for both in-line (real time) and
batch data compression and decompression. It is preferred to use the latter algorithm. Shah and
Johnson in the article DATA COMPRESSOR DECOMPRESSOR IC in the 1990 IEEE International Symposium
on Circuits and Systems, New Orleans LA USA (pp 41-43) on May 1-3, 1990 describe an integrated
circuit using the known Lempel-Ziv algorithm mentioned above. In practicing the present
invention, it is preferred that a compression-decompression algorithm that facilitates both
batch and in-line operations be used. Of course, only batch or only in-line data
compression-decompression may be used to successfully practice the present invention.

Images or "non-coded" data have been compressed and decompressed for saving data storage space.
Reitsma US patent 4,622,585 shows one video compression scheme.

SUMMARY OF THE INVENTION 

The present invention provides flexible data compression-decompression controls that enable
randomly accessing compressed data through relatively simple accessing mechanisms.

According to a first aspect, the present invention provides apparatus for storing data in
compressed form in a data storage device having a plurality of addressable like-sized data
storage areas, each for recording a predetermined number of data bytes, the data storage device
being connected to means for receiving data to be recorded, said received data being arranged in
a plurality of addressable data blocks, characterised in that the apparatus comprises, in
combination: selection means in the means for receiving data for selecting one or more data
transfer units of data blocks to be recorded, each said transfer unit of data blocks having a
given number of data bytes and including one or more of said addressable data blocks;
compression means connected to the selection means for compressing said transfer unit of data
blocks to be recorded as a group of compressed data blocks; data access means in the device
connected to said compression means for recording said group of compressed data blocks in said
addressable data storage areas as one continuum of compressed data; and directory means
indicating which ones of said addressable data storage areas said continuum of data is recorded
in, and indicating that said continuum of data contains said selected transfer unit of data
blocks in a compressed form.

In a second aspect of the present invention there is provided a method of compressing and
recording onto a data storage medium data of a file which is arranged in a plurality of
addressable data blocks, the method comprising the steps of: selecting a plurality of said data
blocks of said file to be compressed and recorded; segmenting the selected plurality of
addressable data blocks into one or more data transfer units; compressing each of said one or
more data transfer units and recording them as respective separate groups of compressed data
blocks; and creating and maintaining a file directory indicating the address and size of each of
said recorded groups for enabling random access to recorded data within said file of data blocks.

Preferably, the file directory provides information for addressing each of the compressed data
blocks within a group. It is further preferred that the directory is maintained in a host
processor and is also stored on the data storage medium containing the group of compressed data
blocks.

In a preferred embodiment of the present invention, a data file having a plurality of
addressable data blocks is segmented into a plurality of groups of such data blocks. Each group
of data blocks is separately compressed and decompressed as one unit of data. Each such group is
separately transmitted between a host processor and a data storage unit, communications link,
etc. as one data transfer unit (DTU). The size of the DTU, in terms of the number of data blocks
to be included, is determined empirically based upon the data storage capacity of (number of
data bytes storable in) sectors into which a data storage volume is divided, the number of bytes
in each of the data blocks of the data file, and other system parameters. The data storage of
each group in compressed form in a data storage device is described by the data storage system
to the host processor, preferably by a command linked to the host processor command effecting
the data storage in compressed form. The host processor establishes a directory describing the
storage of each and every group of the data file. If the data file is transferred to another
system or host processor in the compressed form, the compressed data file directory accompanies
the compressed groups. Retrieving compressed data from a data storage device is by retrieving
the group of data blocks having the data block(s) desired to be read. Each compressed group of
data blocks is transferable between host processors and data storage units without
decompression. The DTU or group-receiving data storage medium may be formatted in the well known
fixed-block architecture (FBA) format, the well known count-key-data (CKD) format, the well
known extended count-key-data (ECKD) format or any other format.

Embodiments of the present invention will now be described in more detail, with reference to the
accompanying drawings in which:

Fig. 1 is a flow chart illustrating data storing operations according to an embodiment of the
present invention;

Fig. 2 is a simplified block diagram of a data processing system in which the data storing
operations according to Fig. 1 may be advantageously employed;

Fig. 3 is a diagrammatic representation of a Logical Block Address (LBA) directory for
identifying recorded compressed groups of data blocks of a data file;

Fig. 4 is a block diagram showing details of an optical data storage system attached to a host
processor such as is shown in Fig. 2;

Fig. 5 is a block diagram of a peripheral controller usable in data processing systems such as
are shown in Fig's 2 and 4;

Fig. 6 diagrammatically illustrates storing a compressed group of data blocks according to the
steps of Fig. 1;

Fig. 7 diagrammatically illustrates host processor commands using a SCSI connection to a data
storage system such as is shown in Fig. 2 or Fig. 4;

Fig. 8A diagrammatically illustrates a file directory of a plurality of compressed groups of
data blocks of a file according to an embodiment of the present invention;

Fig. 8B diagrammatically illustrates the format of a disk sector according to an embodiment of
the present invention;

Fig's 9-13 are flow charts showing details of the operation shown in Fig. 1;

Fig. 14 is a logic diagram illustrating an application of the present invention to a multi-unit
data processing system that has a plurality of data storage devices and host processors
interconnected by a data link or local area network; and

Fig. 15 is a flow chart showing machine operations that update a compressed data file according
to an embodiment of the present invention.

DETAILED DESCRIPTION 

Referring now more particularly to the appended drawings, like numerals indicate like parts and
structural features in the various figures. A data file having a plurality of data blocks is
divided into one or more transfer units of data blocks. Before data storage, each transfer unit
of data blocks is subjected to its own data compression cycle to create a group of compressed
data blocks. The size of the data transfer unit, in bytes, is selected to be facile for
addressing and retrieving individual recorded groups of compressed data blocks while providing
good channel utilization and compression efficiency. Also the data transfer unit size is
selected in part based upon data storage efficiency, i.e. the storage of the data, after
compression, should fill several allocated addressable data storage areas. Each of the allocated
sectors in each group is filled to capacity except the last sector of a group that may be
partially filled. It is desired to reduce the number of partially filled data storage sectors
for more efficiently filling the FBA data storage disk with data. This desire is balanced with
enabling efficient random access to the compressed data blocks stored on the FBA data storage
disk.

Each stored or recorded group of compressed data blocks is accessed from disk 30 as a single
data unit irrespective of the number of disk 30 sectors in which the group is recorded. Since
each group of compressed data blocks is compressed in a separate data compression operation, all
of the data in each such group must be decompressed starting with the beginning, i.e. first
compressed bytes, in each group. Therefore, in randomly accessing a compressed desired data
block in a given group, all of the compressed data blocks of each stored group are read from
disk 30 as a single disk record. The single disk record is decompressed up to the desired or
addressed compressed data block. The desired compressed data block is then decompressed for
processing. Limiting the size of the groups of compressed data blocks provides for quicker
access to any desired compressed data block. This desire is balanced with a desire to maximize
utilization of the disk 30 data storage space. An example of managing these two parameters for
creating a facile size group of compressed data blocks (that varies with each application) is
described later.
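
The access pattern described above, reading the whole group and decompressing from its first byte up to the wanted block, can be sketched as follows. This is only an illustration: `zlib` stands in for the Lempel-Ziv compressor preferred in the text, and `BLOCK_BYTES`, `compress_group` and `read_block` are hypothetical names assuming fixed-size uncompressed blocks.

```python
import zlib

BLOCK_BYTES = 1024  # hypothetical fixed size of one uncompressed data block


def compress_group(blocks: list[bytes]) -> bytes:
    """Compress all data blocks of one DTU in a single compression cycle."""
    return zlib.compress(b"".join(blocks))


def read_block(group: bytes, index: int) -> bytes:
    """Decompress from the start of the group and stop after block `index`."""
    if index < 0:
        raise IndexError(index)
    wanted = (index + 1) * BLOCK_BYTES
    d = zlib.decompressobj()
    out = bytearray()
    pending = group
    # max_length bounds the output, so later blocks are never expanded.
    while len(out) < wanted and pending:
        out += d.decompress(pending, wanted - len(out))
        pending = d.unconsumed_tail
    if len(out) < wanted:
        raise IndexError(index)
    return bytes(out[index * BLOCK_BYTES:wanted])
```

The cost of `read_block` grows with the position of the block inside the group, which is the reason the text gives for limiting group size.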

In an alternate arrangement each data block is separately compressed. A plurality of such
separately compressed data blocks are combined into a single disk record. The byte position
within the single disk record for each of the separately compressed data blocks is recorded in
the single disk record. Such byte position or offset enables addressing each of the compressed
data blocks within a group.
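
A minimal sketch of this alternate arrangement follows. The record layout (a block count followed by a table of little-endian 32-bit byte offsets) is an assumption made for illustration, not a format taken from the text, and `zlib` again stands in for the compressor.

```python
import struct
import zlib


def pack_record(blocks: list[bytes]) -> bytes:
    """Compress each block on its own and prefix a table of byte offsets."""
    parts = [zlib.compress(b) for b in blocks]
    header_len = 4 + 4 * (len(parts) + 1)   # count + offsets + end sentinel
    offsets, pos = [], header_len
    for p in parts:
        offsets.append(pos)
        pos += len(p)
    offsets.append(pos)                      # end of the last block
    header = struct.pack(f"<I{len(offsets)}I", len(parts), *offsets)
    return header + b"".join(parts)


def unpack_block(record: bytes, index: int) -> bytes:
    """Decompress only block `index`, located through the offset table."""
    (count,) = struct.unpack_from("<I", record, 0)
    if not 0 <= index < count:
        raise IndexError(index)
    start, end = struct.unpack_from("<2I", record, 4 + 4 * index)
    return zlib.decompress(record[start:end])
```

Compared with compressing the group as one unit, any block can be decompressed without touching its neighbours, at the price of a header and usually a somewhat lower compression ratio.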

To facilitate access to the groups of compressed data blocks, the host processor program
maintains a directory that identifies the addressable data storage areas containing the group as
well as the data blocks in the respective groups. This directory identification preferably takes
the form of a file directory that is maintained in host processor 11. Such directory is also
stored on the volume or data storage disk containing the group(s) of compressed data blocks.
Preferably, the directory is transmitted to the disk device as a part of each transfer of a
compressed file having plural groups of compressed data blocks. This arrangement establishes on
the FBA disk a directory that effects addressability of the compressed data blocks within the
respective groups.
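
One way such a directory might be modelled is sketched below. The field names and the lookup method are hypothetical; the directory actually used in the embodiment is the one shown in Fig. 8A, which this sketch does not attempt to reproduce.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class GroupEntry:
    first_block: int    # number of the first data block held in the group
    block_count: int    # data blocks in the group
    first_sector: int   # first addressable data storage area of the group
    sector_count: int   # addressable areas actually occupied
    compressed: bool    # False if the group was stored uncompressed


class FileDirectory:
    def __init__(self) -> None:
        self.groups: list[GroupEntry] = []

    def add(self, entry: GroupEntry) -> None:
        """Append an entry; groups are assumed to arrive in block order."""
        self.groups.append(entry)

    def locate(self, block: int) -> tuple[GroupEntry, int]:
        """Return the group holding `block` and the block's position in it."""
        starts = [g.first_block for g in self.groups]
        i = bisect_right(starts, block) - 1
        if i < 0 or block >= self.groups[i].first_block + self.groups[i].block_count:
            raise KeyError(block)
        g = self.groups[i]
        return g, block - g.first_block
```

A read of one data block then becomes: locate the group, read its `sector_count` sectors as one record, and decompress up to the returned position.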

Fig. 1 illustrates recording a data file by grouping a plurality of data blocks of the file into
a smaller number of groups of compressed data blocks. Step 10 is executed in a host processor 11
(Fig. 2). A data file, or part of a data file, is identified for compressed data storage. The
data file consists of a plurality of data blocks. The term data block includes data records
(coded data), sub-file structures, individual images, graphs and the like, drawings and other
forms of graphics, combined graphics (non-coded data) and text (coded data), and the like. As
later detailed, the data file is divided into facile sized groups of data blocks for transfer as
a data transfer unit DTU to a storage unit or over a communication link and for maintaining a
random access capability to the recorded groups of compressed data blocks. The size of each DTU
and resultant recorded group is dependent on diverse variables, as will become apparent.
Completion of one execution of step 10 results in one such group of data blocks being selected
for compression and storage.

Step 13 is executed by host processor 11 (Fig. 2). The number of uncompressed data bytes in the
DTU of data blocks (the product of the number of data blocks times the number of bytes in each
data block) is divided by the data storage capacity of one addressable data storage area (sector
of an FBA formatted disk) and rounded to a next higher integer if the quotient includes a
fraction. This number represents a maximum number of addressable data storage areas required to
store the data, either uncompressed or if a compression does not compress the data into fewer
bytes for storage. At this juncture, it is not known how many addressable data storage areas are
required to store the group of data blocks after compression. To ensure that the group of data
blocks is storable on the data storage medium (optical disk 30 is used in the illustrative
embodiment), a number of the addressable data storage areas sufficient to store the entire group
of compressed data blocks is initially determined for storing the group of data blocks in an
uncompressed form.
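
The step 13 calculation is a ceiling division. A short sketch, with a hypothetical function name and invented example sizes:

```python
def max_sectors(num_blocks: int, block_bytes: int, sector_bytes: int) -> int:
    """Worst-case number of sectors for one DTU, i.e. its uncompressed size."""
    total_bytes = num_blocks * block_bytes
    return -(-total_bytes // sector_bytes)  # round up any fractional sector


# Ten 1000-byte blocks on 512-byte sectors: 10000 / 512 = 19.53..., so 20 sectors.
print(max_sectors(10, 1000, 512))
```
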

Step 15 is executed by both the host processor 11 and data storage system 12. The selected DTU
of data blocks is transmitted by the host processor to the data storage system. The selected DTU
of data blocks is compressed before storage on the data storage medium 30 (Fig. 4). There are
several methodologies that may be employed herein. The Fig. 1 indicated methodology requires the
data storage system to allocate the maximum number of addressable data storage areas. Then the
data transfer occurs, requiring the data storage system to compress the selected DTU of data
blocks just before the data are recorded on the data storage medium 30 (Fig. 4). Upon completion
of the compression and data storage or recording as one continuum of data, data storage system
12 determines the number of addressable data storage areas actually used to store the compressed
group of data blocks. The unused but allocated addressable data storage areas are then
deallocated. In the event that certain data blocks compress to a greater number of bytes than
the original or uncompressed data, then, as will become apparent, the data compression step is
not used. Control data are recorded on the FBA disk that indicate which data are compressed and
which data are not compressed. Such control data are used in retrieving data from the data
storage (FBA) disk, as will become apparent. As later detailed in this specification, at step 16
data storage system 12 sends the storage locations of the just-recorded group of compressed data
blocks to the host processor 11 for inclusion in a directory of the data file of which the
recorded group of data blocks is a member.
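
The first methodology (allocate for the worst case, compress, record, release what was not used, fall back to uncompressed storage) might look like the sketch below. Everything named here is an assumption for illustration: the dictionary standing in for the disk, the `RecordStatus` fields standing in for the status returned at step 16, the 512-byte sector, and `zlib` as the compressor.

```python
import zlib
from dataclasses import dataclass

SECTOR_BYTES = 512  # hypothetical sector capacity


@dataclass
class RecordStatus:
    first_sector: int
    sectors_used: int
    sectors_released: int
    compressed: bool      # control flag: was the group stored compressed?
    stored_bytes: int


def record_dtu(dtu: bytes, first_sector: int, disk: dict[int, bytes]) -> RecordStatus:
    allocated = -(-len(dtu) // SECTOR_BYTES)   # worst case: uncompressed size
    packed = zlib.compress(dtu)
    compressed = len(packed) < len(dtu)
    payload = packed if compressed else dtu     # skip compression if it does not help
    used = -(-len(payload) // SECTOR_BYTES)
    for i in range(used):                       # record as one continuum of sectors
        disk[first_sector + i] = payload[i * SECTOR_BYTES:(i + 1) * SECTOR_BYTES]
    return RecordStatus(first_sector, used, allocated - used, compressed, len(payload))
```

Only the last written sector can be partially filled, and `sectors_released` is the count of allocated but unused areas handed back.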

A second methodology has the data compression-decompression performed in host processor 11. As
such, host processor 11 includes the data compression mechanism, either software or hardware,
and sends the compressed selected group of data blocks to data storage system 12 for storage. In
this instance, if batch compression is used, host processor 11 determines the number of
addressable data storage areas required for storing the compressed group of data blocks. Host
processor 11 then sends the required number of addressable data storage areas to data storage
system 12 for allocation just before the compressed data are transmitted to the data storage
system.

In a third methodology, the uncompressed group of data blocks is transmitted twice by host
processor 11 to data storage system 12. A first transmission enables data storage system 12 to
accurately measure the number of addressable data storage areas that will be required to store
the compressed data. In the first transmission the data are compressed but not recorded. The
number of compressed data bytes is counted to determine the data storage extent (number of
sectors or addressable areas) for the compressed data. The data storage system 12 then allocates
the indicated number of contiguous sectors for receiving and storing the compressed data. A
second transmission of the same data to the data storage system 12 results in the compression
and storage of the compressed data in a data storage medium.
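
The two transmissions of the third methodology can be sketched as a measuring pass and a recording pass over the same stream of chunks. The function names, the 512-byte sector and the use of `zlib` streaming objects are assumptions for illustration only.

```python
import zlib

SECTOR_BYTES = 512  # hypothetical sector capacity


def measure_pass(chunks: list[bytes]) -> int:
    """First transmission: compress, count the bytes, keep nothing."""
    c = zlib.compressobj()
    n = sum(len(c.compress(chunk)) for chunk in chunks)
    n += len(c.flush())
    return -(-n // SECTOR_BYTES)   # sectors to allocate


def record_pass(chunks: list[bytes], sectors: int) -> bytes:
    """Second transmission: compress again and return the data to record."""
    c = zlib.compressobj()
    out = b"".join(c.compress(chunk) for chunk in chunks) + c.flush()
    if len(out) > sectors * SECTOR_BYTES:
        raise ValueError("compressed data exceed the allocated extent")
    return out
```

Because the same data and the same compressor settings are used in both passes, the second pass produces exactly the byte count measured in the first, so the allocation is exact.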

In each of the above described methodologies, if the number of bytes in the compressed file is
greater than the number of uncompressed data bytes, then the data are recorded in the
uncompressed form. Further, when updating a group of compressed data blocks, the number of
compressed data bytes may exceed the capacity of the currently allocated sectors. As described
later with respect to Fig. 15, a change in allocation of sectors for storing the updating DTU
may be required.

Also, in each of the above described methodologies, the data blocks to be compressed and stored
from each DTU are preferably compressed and stored as one group. That is, all data blocks in
each DTU are compressed during one data compression cycle to produce one group of compressed
data blocks. An alternate data compression approach is to individually compress each of the data
blocks in each DTU. Then the group of compressed data blocks consists of a plurality of
individually compressed data blocks. In the alternative data compression, a header in each group
can identify the byte offset within each group of the individually compressed data blocks. Such
individually compressed data blocks may also be identified on the data recording disk by illegal
recording code characters; such characters are well known for diverse data recording codes.

Host processor 11 in step 19 logically associates all recorded groups of compressed data blocks
via a later described file directory. When employing the above described first methodology, upon
storing the compressed data, data storage system 12 reports to host processor 11 the actual
number of sectors used to store the compressed data and further address and identifying data
therefor, as will be described.

At step 20, host processor 11 determines whether all of the data to be compressed and recorded
have been recorded. The details regarding the recorded group of compressed data blocks (see step
19) have been entered into the later described file directory (Fig. 8A). If all of the above
described machine operations have been completed, then the operation is "done", enabling exiting
to other machine operations beyond the present description. Otherwise, steps 10-19 are repeated
as above described until all of the data have been compressed and recorded. It is to be noted
that other machine operations may be performed by host processor 11 in a multitasking or
interrupt driven data processing environment while steps 10-19 are in the process of execution
as is known in the data processing art.

Fig. 2 shows a data processing system in simplified form. Host processor 11 attaches a data
storage system 12. Data storage system 12 includes a peripheral controller 20 that connects host
processor 11 to data storage device 21. Device 21, in one embodiment of this invention, is a
magneto-optical data storage device that operates with removable magneto-optical data storage
media or a single medium (disk). As later used in this specification, the term programmed
machine includes host processor 11, peripheral controller 20 and programmed portions of data
storage device 21. The compression-decompression mechanisms are preferably in the programmed
machine. For in-line compression-decompression, it is preferred that the
compression-decompression occurs in the peripheral controller 20. As later described with
respect to Fig. 14, the compression-decompression mechanism can be anywhere in the programmed
machine. For batch compression-decompression it is preferred to place the
compression-decompression in host processor 11.

Fig. 3 illustrates a logical block address (LBA) structure 23 used in magnetic [...] systems for
addressing sectors of an optical disk. LBA 23 is a logical to real address translation mechanism
that enables full advantage of practicing the present invention. This sector addressing is based
upon the logical addressing found in many present day optical disk data storage devices. The
attaching host processor 11 addresses data on disk 30 (Fig. 4) using a logical block address
included in LBA 23. LBA 23 determines which of the addressable physical data storage areas, such
as sectors, are addressed by the respective LBA address. In an alternate addressing arrangement,
host processor 11 requests access to a named file. This alternate addressing arrangement
includes host processor 11 identifying a byte location within the file to begin a data operation
and a number of bytes (byte length) to be subjected to the data operation, i.e. read from the
disk, for example.

LBA 23 is managed by either one of two algorithms. A first one has been used for optical disks.
In this algorithm, the number of entries in LBA 23 is constant for each disk and is based upon
the number of addressable entities in the disk designated for storing data. Spare addressable
data storage areas or sectors are not included in the LBA 23 logical address sequence, as is
known. Known secondary pointers enable addressing spare sectors via LBA 23.

A second algorithm for addressing using LBA 23 is used in magnetic flexible diskettes. In this second
algorithm, the address range of LBA 23 varies with the number of demarked or unusable sectors. LBA 23
identifies for addressing only the tracks and sectors that are designated for storing data. In the event one of the
sectors identifiable by the illustrated address translation becomes unusable, then the unusable or defective
sector is skipped and replaced by another sector. Such substitution is well known.

All of the addressable tracks and sectors on disk 30 are addressed via LBA 23. Such addressing is a table
look up matching the host processor 11 supplied logical address to a physical disk track and sector storing the
data identified by the supplied logical address. Each LBA logical address has one entry 14 in LBA 23.
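
The table look-up and the skip-over handling of unusable sectors described for the second algorithm can be sketched in Python as follows. This is only an illustrative sketch: the function names `build_lba_table` and `lookup`, the (track, sector) tuple layout, and the disk geometry are assumptions made for the example and are not taken from this specification.

```python
from typing import List, Set, Tuple

PhysAddr = Tuple[int, int]  # (track, sector); hypothetical layout


def build_lba_table(tracks: int, sectors_per_track: int,
                    defective: Set[PhysAddr]) -> List[PhysAddr]:
    """Map consecutive logical addresses to usable physical sectors.

    Defective sectors are skipped, so the logical address range shrinks
    as more sectors are marked unusable.
    """
    table: List[PhysAddr] = []
    for t in range(tracks):
        for s in range(sectors_per_track):
            if (t, s) not in defective:
                table.append((t, s))
    return table


def lookup(table: List[PhysAddr], lba: int) -> PhysAddr:
    """Translate a host-supplied logical address to a track and sector."""
    if not 0 <= lba < len(table):
        raise IndexError(f"LBA {lba} is outside the addressable range")
    return table[lba]


table = build_lba_table(tracks=2, sectors_per_track=4, defective={(0, 2)})
print(lookup(table, 2))  # (0, 3): the defective sector (0, 2) is skipped
```

Each index of `table` plays the role of one LBA entry; the host only ever sees the index.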

Numerals 17 and 18 indicate groups of compressed data blocks recorded on disk 30 using the present
invention. Numeral 17 indicates the first group of compressed data blocks of one file. Numeral 18 indicates
subsequently recorded groups of compressed data blocks from the same file. The enumeration of the data
blocks in the recorded groups 17-18 is maintained in its original sequence as generated by host processor 11.
As will become apparent, the compressed data blocks in the respective groups are identified in a file directory
shown in Fig. 8A.

A magneto-optic data storage drive or device 21 is illustrated in Fig. 4 as it is connected to host processor
11 via peripheral controller 20. As usual, peripheral controller 20 is packaged with the optical disk drive. A
magneto-optic record disk 30 is removably mounted for rotation on spindle 31 by motor 32. A usual disk cartridge
receiver (not shown) is in operative relation to spindle 31 for inserting and ejecting magneto-optical or other
optical disks 30 into and from drive 21. Optical portion 33 of drive 21 is mounted on frame 35. A headarm
carriage 34 moves radially of disk 30 for carrying an objective lens 45 from track to track. A frame 35 of recorder
suitably mounts carriage 34 for reciprocating radial motions. The radial motions of carriage 34 enable access
to any one of a plurality of concentric tracks or circumvolutions of a spiral track for recording and recovering
data on and from the disk. Linear actuator 36, suitably mounted on frame 35, radially moves carriage 34 for
enabling track accessing. The recorder is suitably attached to one or more host processors 11; such host
processors may be control units, personal computers, large system computers, communication systems, image
signal processors, and the like. Attaching circuits 38 provide the logical and electrical connections between the
optical recorder and peripheral controller 20.

Device microprocessor 40 controls device 21 including the attachment circuits connected to peripheral
controller 20. Control data, status data, commands and the like are exchanged between attaching circuits 38 and
device microprocessor 40 via bidirectional bus 43. Included in microprocessor 40 is a program or
microcode-storing, read-only memory (ROM) 41 and a data and control signal storing random-access memory (RAM) 42.
The optics of the recorder (drive or device) 21 include an objective or focusing lens 45 mounted for focusing
and radial tracking motions on headarm 33 by fine actuator 46. This actuator includes mechanisms for moving
lens 45 toward and away from disk 30 for focusing and for radial movements parallel to carriage 34 motions;
for example, for changing tracks within a range of 100 tracks so that carriage 34 need not be actuated each
time a track adjacent to a track currently being accessed is to be accessed. Numeral 47 denotes a two-way
light path between lens 45 and disk 30.

In magneto-optic recording, magnetic bias field generating coil 48, in a constructed embodiment an
electromagnet, provides a weak magnetic steering or bias field for directing the remnant magnetization direction of a
small spot on disk 30 illuminated by laser light from lens 45. The laser light spot heats the illuminated spot on
the record disk to a temperature above the Curie point of the magneto-optic layer (not shown, but can be an
alloy of rare earth and transitional metals as taught by Chaudhari et al., USP 3,949,387). This heating enables
the magnet coil 48 generated bias field to direct the remnant magnetization to a desired direction of magnetization
as the spot cools below the Curie point temperature. Magnet coil 48 is shown as supplying a bias field oriented
in the "write" direction, i.e., binary ones recorded on disk 30 normally are "north pole remnant magnetization".
To erase disk 30, magnet coil 48 supplies a field so the south pole is adjacent disk 30. Magnet coil 48 control
49 is electrically coupled to magnet coil 48 over line 50 to control the write and erase directions of the coil 48
generated magnetic field. Microprocessor 40 supplies control signals over line 51 to control 49 for effecting
reversal of the bias field magnetic polarity.

It is necessary to control the radial position of the beam following path 47 such that a track or
circumvolution is faithfully followed and that a desired track or circumvolution is quickly and precisely accessed. To this
end, focus and tracking circuits 54 control both the coarse actuator 36 and fine actuator 46. The positioning
of carriage 34 by actuator 36 is precisely controlled by control signals supplied by circuits 54 over line 55 to
actuator 36. Additionally, the fine actuator 46 control by circuits 54 is exercised through control signals travelling
to fine actuator 46 over lines 57 and 58, respectively, for effecting respective focus and track following and
seeking actions. Sensor 58 senses the relative position of fine actuator 46 to headarm carriage 33 to create a
relative position error (RPE) signal. Line 57 consists of two signal conductors, one conductor for carrying a focus
error signal to circuits 54 and a second conductor for carrying a focus control signal from circuits 54 to the
focus mechanisms in fine actuator 46.

The focus and tracking position sensing is achieved by analyzing laser light reflected from disk 30 over
path 47, thence through lens 45, through one-half mirror 60 and to be reflected by half-mirror 61 to a so-called
"quad detector" 62. Quad detector 62 has four photoelements which respectively supply signals on four lines
collectively denominated by numeral 63 to focus and tracking circuits 54. Aligning one axis of the detector 62
with a track center line, track following operations are enabled. Focusing operations are achieved by comparing
the light intensities detected by the four photoelements in the quad detector 62. Focus and tracking circuits
54 analyze the signals on lines 63 to control both focus and tracking.

Recording or writing data onto disk 30 is next described. It is assumed that magnet 48 is rotated to the
desired position for recording data. Microprocessor 40 supplies a control signal over line 65 to laser control 66
for indicating that a recording operation is to ensue. This means that laser 67 is energized by control 66 to emit
a high-intensity laser light beam for recording; in contrast, for reading, the laser 67 emitted laser light beam is
of a reduced intensity so as not to heat the laser-illuminated spot on disk 30 above the Curie point. Control 66
supplies its control signal over line 68 to laser 67 and receives a feedback signal over line 69 indicating the laser
67 emitted light intensity. Control 66 adjusts the light intensity to the desired value. Laser 67, a semiconductor
laser, such as a gallium-arsenide diode laser, can be modulated by data signals so the emitted light beam
represents the data to be recorded by intensity modulation. In this regard, data circuits 75 (later described) supply
data indicating signals over line 78 to laser 67 for effecting such modulation. This modulated light beam passes
through polarizer 70 (linearly polarizing the beam), thence through collimating lens 71 toward half mirror 60
for being reflected toward disk 30 through lens 45. Data circuits 75 are prepared for recording by the
microprocessor 40 supplying suitable control signals over line 76. Microprocessor 40 in preparing circuits 75 is
responding to commands for recording received from a host processor 11 via attaching circuits 38. Once data
circuits 75 are prepared, data is transferred directly between peripheral controller 20 and data circuits 75
through attaching circuits 38. Data circuits 75 also include ancillary circuits (not shown) relating to disk 30
format signals, error detection and correction, and the like. Circuits 75, during a read or recovery action, strip the
ancillary signals from the readback signals before supplying corrected data signals over bus 77 to peripheral
controller 20 via attaching circuits 38.

Reading or recovering data from disk 30 for transmission to host processor 11 requires optical and electrical
processing of the laser light beam from the disk 30. That portion of the reflected light (which has its linear
polarization from polarizer 70 rotated by disk 30 recording using the Kerr effect) travels along the two-way light
path 47, through lens 45 and half-mirrors 60 and 61 to the data detection portion 79 of the headarm 33 optics.
Half-mirror or beam splitter 80 divides the reflected beam into two equal intensity beams both having the same
reflected rotated linear polarization. The half-mirror 80 reflected light travels through a first polarizer 81 which
is set to pass only that reflected light which was rotated when the remnant magnetization on the disk 30 spot being
accessed has a "north" or binary one indication. This passed light impinges on photocell 82 for supplying a
suitable indicating signal to differential amplifier 85. When the reflected light was rotated by a "south" or erased
pole direction remnant magnetization, then polarizer 81 passes no or very little light, resulting in no active signal
being supplied by photocell 82. The opposite operation occurs at polarizer 83, which passes only the "south" rotated
laser light beam to photocell 84. Photocell 84 supplies its signal indicating its received laser light to the second
input of differential amplifier 85. The amplifier 85 supplies the resulting difference signal (data representing)
to data circuits 75 for detection. This detection, in the illustrated embodiment, does not include digital
demodulation (decoding the read back signals from a 1-7 d-k code to data in a host processor format). The detected
signals include not only data that is recorded but also all of the so-called ancillary signals as well. The term
"data" as used herein is intended to include any and all information-bearing signals, preferably of the digital
or discrete value type.

The rotational position and rotational speed of spindle 31 are sensed by a suitable tachometer or emitter
sensor 90. Sensor 90, preferably of the optical-sensing type that senses dark and light spots on a tachometer
wheel (not shown) of spindle 31, supplies the "tach" signals (digital signals) to RPS circuit 91 which detects
the rotational position of spindle 31 and supplies rotational information-bearing signals to microprocessor 40.
Microprocessor 40 employs such rotational signals for controlling access to data storing segments on disk 30
as is widely practiced in the magnetic data storing disks. Additionally, the sensor 90 signals also travel to spindle
speed control circuits 93 for controlling motor 32 to rotate spindle 31 at a constant rotational speed. Control 93
may include a crystal-controlled oscillator for controlling motor 32 speed, as is well known. Microprocessor 40
supplies control signals over line 94 to control 93 in the usual manner.

Peripheral controller 20 is shown in Fig. 5. This controller includes the compression-decompression
mechanism for in-line or real time data compression-decompression. A connection between host processor 11 and
peripheral controller 20 is effected by a SCSI module 100 that implements the known small computer system
interface. An IO data buffer 103 (dynamically allocated into input data buffers and output data buffers using
known techniques) temporarily stores data received from or to be transmitted to the host processor 11. An
Optical Disk Controller (ODC) 104 manages the reading and writing of the data to the disk 30 (Fig. 4). Error
Correction Control (ECC) module 106 detects and corrects errors in data being read and generates ECC error
detection and correction redundancy characters to be written to the medium with the data. Run Length Limited
(RLL) (mod-demod) encoding and decoding is performed in data circuits 75 (Fig. 4). Such mod-demod encodes
and decodes recorded data patterns, such as used in the known 1-7 d-k code. Microprocessor 107 (plus control
store 108 and dynamic store 109) controls the various elements of the controller 20. A
Compression/Decompression (CD) module 101, such as an integrated circuit referred to by Shah et al, supra, implements the
compression algorithms. CD module 101 includes automatic circuit timing and control, as is known, to control data
flow through peripheral controller 20 under supervision of microprocessor 107. This
compression-decompression is in real time (in-line) with the data transfer. Busses 102, 110 and 111 interconnect the modules, as shown.
Controller 20 is preferably packaged with a device 21 on a common frame.

Fig. 6 illustrates compression of several data blocks into one group of compressed data blocks recorded
in a number of data storing sectors 118 of track 117 of disk 30. A group 115 of a plurality of data blocks 116 is
selected for recording as described with respect to Fig. 1. Group 115 of data blocks is transmitted
to controller 20 by host processor 11. CD 101 in controller 20 compresses group 115 sufficiently to be recorded
as a group of compressed data blocks in sectors 118 plus about one-half of sector 119. The remaining half of
last sector 119 is filled with padding bytes, as is known. Numeral 122 indicates a sector that was allocated
previously. Numeral 123 indicates a next sector(s) that were initially allocated according to the above-described
first methodology. The linked response of controller 20 to the write-compress command indicates to host
processor 11 that sector(s) 123 are to be deallocated, as such sectors did not receive any of the data from group
115. Host processor 11 responds to controller 20 to deallocate sectors 123.
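
The sector accounting of Fig. 6 (sectors filled, padding in the last sector, and allocated sectors that remain unused) reduces to simple integer arithmetic. The following Python sketch is illustrative only; the function name `sector_usage`, the dictionary keys, and the example sizes are assumptions and do not appear in this specification.

```python
def sector_usage(compressed_bytes: int, sector_bytes: int,
                 allocated_sectors: int) -> dict:
    """Return sectors used, padding bytes and sectors to deallocate."""
    # Round up: a partly filled last sector still occupies a whole sector.
    used = (compressed_bytes + sector_bytes - 1) // sector_bytes
    if used > allocated_sectors:
        raise ValueError("compressed group does not fit the allocation")
    padding = used * sector_bytes - compressed_bytes
    return {"used": used, "padding": padding,
            "deallocate": allocated_sectors - used}


# A group that compresses to 3.5 sectors of 1024 bytes, with 6 allocated.
print(sector_usage(3584, 1024, 6))
# {'used': 4, 'padding': 512, 'deallocate': 2}
```

In this example the last sector is half filled with padding, matching the half-used sector 119, and the two untouched sectors correspond to sectors 123.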

The above description assumes that host processor 11 is performing data space management. This
arrangement is usual. It is to be pointed out that in a multi-host arrangement of sharing device 21, one of
the hosts may be designated to perform space management. Also, in some systems the peripheral controller
performs data storage space management.

Fig. 7 illustrates in abbreviated form three commands for use in a known SCSI interface. WRITE command
130 includes the operation code field 131 that indicates the command is a WRITE command. LBA address
field 132 indicates the first LBA address at which storage of the data transmitted in accordance with the instant
WRITE command is to begin (the lowest LBA address of possibly several LBA addresses required to be used in storing
data into a plurality of disk 30 sectors). Field 133 indicates the number of units of data that are to be transferred
from host processor 11 to device 21 for storage on disk 30. One unit is that data storable in one sector of the
disk 30. FBA disks may have different data storing capacity sectors, such as 512, 1024 (1 KB), 2048, or 4096
bytes of data. Field 134 indicates whether or not the data to be transmitted is to be compressed. Field 135
indicates that this WRITE command is linked to read buffer command 140. This command linkage requires
peripheral controller 20 to report to host processor 11 the details of the data storage, i.e. the number of sectors
actually used, the data that enables host processor 11 to build an entry for the later described Fig. 8A illustrated
file directory, and an identification of the sectors to be deallocated. It is noted that LBA 23 is updated in host processor
11 with a copy thereof recorded in a sector of disk 30. Also, a copy of the Fig. 8A illustrated file directory is
recorded on disk 30, preferably in an uncompressed form at a first LBA 23 logical address that immediately
precedes the first LBA address for storing compressed data.

Read buffer SCSI command 140 includes operation code field 141 that indicates the command is a READ
BUFFER command. Controller 20 responds to receipt of a READ BUFFER command by transferring data from an
output register(s) of IO buffer 103. Controller 20 stores the information relating to storing a group of compressed
data blocks in such output buffer 103 register(s) in preparation to respond to the READ BUFFER command
linked to the WRITE command 130. Field 142 indicates to controller 20 the number of sectors used to store
the compressed data blocks. That is, host processor 11 knows the number of disk sectors required for storing
the compressed data blocks, hence the new entry for the Fig. 8A illustrated file directory.

READ DATA command 145 has operation code field 146 having an indication that the command is a READ
DATA command. The first LBA address to be used for transferring data from disk 30 to host processor 11 is
indicated in field 147. Field 148 indicates the number (n) of data blocks requested or commanded to be
transferred from the FBA disk to the host processor 11. Field 149 indicates to controller 20 the number (N) of disk
sectors that are to be read. Field 150 indicates that decompress is either on or off. Link on bit 151 is usually
reset to be inactive. For reading one group of compressed data blocks, controller 20 reads the indicated number
(N) of sectors, decompresses the data blocks, then transfers the decompressed data blocks to host processor
11. Controller 20 counts the number of data blocks transferred such that when the indicated number n of field
148 is reached, the data transfer is terminated. The data block counting is also used as an integrity check.
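
As a rough illustration of the abbreviated command fields of Fig. 7, two of the three commands can be modeled in Python as plain records. This sketch is hypothetical: the class names, attribute names, and the helper `blocks_to_transfer` are invented for the example, and the records do not represent actual SCSI command descriptor block layouts.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WriteCommand:          # WRITE command 130
    lba: int                 # field 132: first LBA address
    units: int               # field 133: sector-sized units to transfer
    compress: bool           # field 134: compression on or off
    link: bool               # field 135: linked to READ BUFFER command 140


@dataclass
class ReadDataCommand:       # READ DATA command 145
    lba: int                 # field 147: first LBA address
    blocks: int              # field 148: number n of data blocks requested
    sectors: int             # field 149: number N of sectors to read
    decompress: bool         # field 150: decompression on or off
    link: bool = False       # bit 151: usually reset (inactive)


def blocks_to_transfer(cmd: ReadDataCommand,
                       decoded_blocks: List[bytes]) -> List[bytes]:
    """Stop the transfer once the requested block count n is reached."""
    if len(decoded_blocks) < cmd.blocks:
        # The block count doubles as an integrity check.
        raise ValueError("integrity check failed: too few data blocks")
    return decoded_blocks[:cmd.blocks]


cmd = ReadDataCommand(lba=10, blocks=2, sectors=4, decompress=True)
print(blocks_to_transfer(cmd, [b"a", b"b", b"c"]))  # [b'a', b'b']
```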
The Fig. 8A illustrated file directory can indicate different levels of detail; the selected level is application
dependent. Every file that has data blocks recorded in groups of compressed data blocks has a separate
portion of the directory, respectively indicated by numerals 161, 162 and 163 for three different data files. Each
row 160 of each directory represents one entry. A first entry in each directory includes in column 164 the
filename of the file and the LBA address at which the directory is recorded on disk 30. Column 165 in the first or
topmost entry indicates the number of data blocks in each data transfer unit. The term data transfer unit (DTU)
indicates that a given number of data blocks are to be transferred between disk 30 and host processor 11 during
each data transfer. The remaining entries 160 are respectively for the transmitted and recorded groups of
compressed data blocks. Again, column 164 in the respective entries indicates the first LBA address used to store
the group. Column 165 indicates the number of data blocks recorded and the number of sectors used to store
the respective groups of compressed data blocks on disk 30. Once all of the data blocks are compressed in a
single data compress operation, the group of compressed data blocks is a continuum of data with no external
indication of the data block boundaries. The decompression mechanism and associated controls identify the
data block boundaries after decompression, as is known.

In addition to the information contained in the Fig. 8A illustrated file directory, additional details of each
group may be provided. In such an alternate implementation of the file directory, controller 20 returns, in
addition, for each group of compressed data blocks (i.e. for each respective entry of the Fig. 8A illustrated file
directory) a map of the relation of data blocks and data storing sectors (uses the LBA logical address, not the
actual physical location on disk 30) for each of the groups. This additional information is used by the host to
manage the recorded data and unused disk 30 sectors indicated in LBA 23.

All entries contain the above indicated mapping of data blocks to LBA addresses for each and every group
(Gp.) of compressed data blocks in the current file. That is, each data block is indicated as being recorded in
one or more sectors, depending on the compression and size of the data blocks. Several compressed data
blocks may be recorded in one sector. In this instance, the LBA addresses are the same for starting and ending,
i.e. LBA10 to LBA10 for example could occur for several data blocks.

A format of the Fig. 8A illustrated directory using the additional addressing information is set forth below.

First entry     Filename          Number of data blocks in a data transfer unit
Second entry    Gp. 1 LBA         Number of data blocks and sectors in this group
                data block n      LBA N at byte B
                data block n+1    LBA N at byte B^1
                data block n+2    LBA N^2 at byte B^2

(Map of all data blocks in group (Gp.) 1 continues; the term "byte" indicates byte displacement of the respective
compressed data block as recorded in a sector.)

Third entry     Gp. 2 LBA         Number of data blocks and sectors

(Map of data blocks to LBA addresses is as set forth above.)

The superscripts merely indicate first (no superscript), second, etc. byte positions of the respective blocks,
n, n+1, n+2, etc.
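
One possible in-memory shape for such a directory is sketched below in Python. The class names `GroupEntry` and `FileDirectory`, the attribute names, and the method `group_for_block` are assumptions chosen for illustration; the specification does not prescribe a particular data structure.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class GroupEntry:
    first_lba: int       # column 164: first LBA address of the group
    num_blocks: int      # column 165: data blocks recorded in the group
    num_sectors: int     # column 165: sectors used to store the group
    # Optional map: (LBA, byte displacement) for each data block.
    block_map: List[Tuple[int, int]] = field(default_factory=list)


@dataclass
class FileDirectory:
    filename: str        # first entry, column 164
    blocks_per_dtu: int  # first entry, column 165
    groups: List[GroupEntry] = field(default_factory=list)

    def group_for_block(self, block_index: int) -> GroupEntry:
        """Find the group holding a data block, counting from zero."""
        seen = 0
        for g in self.groups:
            if block_index < seen + g.num_blocks:
                return g
            seen += g.num_blocks
        raise IndexError("data block not recorded in this file")


d = FileDirectory("example", 8,
                  [GroupEntry(100, 8, 3), GroupEntry(103, 8, 4)])
print(d.group_for_block(9).first_lba)  # 103
```

Locating a single group this way is what lets one group be read and updated without touching the rest of the file.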

The above described directory structures enable the data contents of a single group of compressed data
blocks to be updated without the necessity of reading and then rewriting the entire file. An update of a group
of compressed data blocks only requires the reading of one group of compressed data blocks. The updated group
of compressed data blocks may require more sectors for storage than that used to store the previous generation
group of compressed data blocks. That is, additional sectors have to be allocated. Since it is desired that each
group of compressed data blocks be recorded in contiguous sectors (except for unaddressable intervening
defective sectors), a new allocation may be required. All of this activity is explained later with respect to Fig. 15.
Host processor 11 uses this information to determine the next available set of contiguous LBA addresses that
have a sufficient number of addresses (sectors) for storing the updated group of compressed data blocks.
For WORM (write once, read many) optical disks, the host processor may issue a MEDIUM SCAN
command to locate the next available LBA addressed sector for storing the updated group of compressed data
blocks. Host processor 11 saves this information in an expanded directory entry for use when the data are to
be retrieved or read.

As later described with respect to Fig. 10, another control parameter is a minimum or maximum number
of sectors to be used in the CKD and ECKD examples for practicing the present invention. The number N of
sectors required to store the uncompressed data is compared with a MIN (minimum value) and a MAX
(maximum value). If the number of required sectors is between the MIN and MAX values, then a DTU is made using
the number N. MIN ensures a reasonable usage of disk storage space while MAX ensures a reasonable access
to compressed data blocks. If N is greater than MAX, then N is made equal to MAX. If N is less than MIN, then
N is made equal to MIN. The number of data bytes in a DTU is N*SB (SB is the number of bytes storable in one
sector) for FBA devices and N*DB (DB is the number of data bytes desired for storing one data block) for CKD
and ECKD devices. The number of bytes in a DTU is stored in the first or top entry 160 (Fig. 8A) of each file
directory. As one variation, field 166 in each of the entries 160 contains a compress DTU indicating bit C. If C
is unity, then the data represented by the respective entry 160 are recorded in a compressed form. If bit C is
zero or nil, then the data are recorded on disk 30 without data compression. The compressed bit C may also
be recorded in each and every sector storing data in accordance with the present invention.
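
The MIN/MAX limiting of N and the resulting DTU byte count can be written out as a short Python sketch. The function names `dtu_sectors` and `dtu_bytes` are hypothetical, and treating MIN and MAX as sector counts here is an assumption made for the example.

```python
def dtu_sectors(n: int, min_sectors: int, max_sectors: int) -> int:
    """Limit the sector count N to the range MIN..MAX."""
    return max(min_sectors, min(n, max_sectors))


def dtu_bytes(n: int, min_sectors: int, max_sectors: int,
              unit_bytes: int) -> int:
    """DTU size in bytes: N*SB for FBA devices, N*DB for CKD and ECKD.

    unit_bytes stands for SB or DB, depending on the device type.
    """
    return dtu_sectors(n, min_sectors, max_sectors) * unit_bytes


print(dtu_bytes(100, 4, 64, 1024))  # 65536: N limited to MAX = 64
print(dtu_bytes(1, 4, 64, 1024))    # 4096: N raised to MIN = 4
print(dtu_bytes(10, 4, 64, 1024))   # 10240: N already within range
```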

Fig. 8B diagrammatically illustrates the format of a disk sector of an FBA disk. Sector 170 is in track 169 of
disk 30. Intersector gap 171 separates sector 170 from an immediately preceding sector (not shown). Sector
ID 172 is an embossed area that contains the track and sector address of sector 170. Intrasector gap 173
separates the hard sectored or embossed mark 172 from the magneto-optically recorded portion that constitutes
the remainder of sector 170. Data synchronization signals DATA SYNC 174 are magneto-optically recorded
with the data stored in portion 175 of sector 170. Control area 176 stores magneto-optically recorded control
signals, as may be desired. A compress bit C 177 (considered a part of the control signals in area 176), if set
to unity, indicates that the data in portion 175 are compressed. If C 177 is set to zero or nil, then the data stored
in portion 175 are not compressed. Sector 170 ends with the error detection and correction redundancy in ECC
178 portion. ECC 178 stored signals are generated and stored in a known manner that is not pertinent to an
understanding of the present invention. Intersector gap 179 separates sector 170 from a next succeeding sector
180. It is preferred that compress bit 177 be used while practicing the present invention.

Fig. 9 is a flow chart showing a sequence of machine operations for storing a file in a plurality of groups
of compressed data blocks wherein each group is separately transmitted from a host processor to a data storage
system as a DTU having a number of uncompressed bytes as set forth above. At step 185 the data to be
recorded is analyzed for determining the number of DTU's to be generated. The actual size in bytes/data blocks
of a DTU may be different from file to file. In step 186, the DTU size is modified to accommodate the number
of data blocks to be initially recorded for equalizing the sizes of a plurality of DTU's to be used. For example,
if the number of data blocks to be compressed and recorded is less than two desired DTU's and one half of
the number of data blocks results in a number of data bytes greater than MIN, then two DTU's each having
one-half of the data blocks are created. This same principle is applied to transferring data blocks having any
number of DTU's except for updating a recorded group of compressed data blocks, as will become apparent.
If the DTU sizes cannot be equalized, then a last DTU may have a number of bytes less than the MIN (minimum)
number of bytes. Upon updating the recorded group of compressed data blocks resulting from a small last DTU,
a DTU is generated that adds a number of data blocks to make the size of the DTU, hence group of compressed
data blocks, larger to meet the DTU size requirements set forth with respect to Figs. 9 and 11. Fig. 15, relating
to updating a recorded group of compressed data blocks, illustrates machine steps for storing an updated DTU
that is too large for the currently allocated data storage space for a recorded group of compressed data blocks
resulting from compressing and storing the updated DTU.
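
The equalization of DTU sizes in step 186 might be sketched in Python as follows. This is one plausible reading rather than the method of the specification: the function name `equalized_dtu_sizes`, the choice to measure sizes in data blocks instead of bytes, and the fall-back to full DTU's followed by a short last DTU are all assumptions made for illustration.

```python
from typing import List


def equalized_dtu_sizes(num_blocks: int, desired: int,
                        min_blocks: int) -> List[int]:
    """Split num_blocks into DTU's of nearly equal size.

    desired is the desired DTU size and min_blocks plays the role of
    MIN, both counted in data blocks (an assumption of this sketch).
    """
    if num_blocks <= 0:
        return []
    count = -(-num_blocks // desired)          # ceiling division
    base, extra = divmod(num_blocks, count)
    if base > min_blocks:
        # Equal shares; the first 'extra' DTU's carry one more block.
        return [base + 1] * extra + [base] * (count - extra)
    # Sizes cannot be equalized: full DTU's, then a short last DTU.
    full, rest = divmod(num_blocks, desired)
    return [desired] * full + ([rest] if rest else [])


print(equalized_dtu_sizes(14, 8, 4))  # [7, 7]
print(equalized_dtu_sizes(9, 8, 6))   # [8, 1]
```

The first call mirrors the example in the text: fewer than two desired DTU's of data, split into two halves that each exceed MIN.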

At "GET DTU" step 188 (Fig. 9), a DTU of data blocks is built for data transfer. Step 189 transfers the DTU
to data storage system 12. Data storage system 12 compresses and stores the transferred DTU as described
earlier. At step 190, host processor 11 ascertains whether another DTU is to be transferred. If not (DONE =
1), then host processor 11 exits for performing other work not related to practicing the present invention.
Otherwise, steps 188 and 189 are repeated until all DTU's have been transmitted to data storage unit 12.

Fig. 10 is a flow chart showing selecting a MIN and a MAX value respectively for image (non-coded or
graphics) data and text (coded) data. The compressibility of data is a measure for selecting MIN and MAX. In
this regard, each file of image or text data may compress substantially differently from data from other files, as
well as changing from data block to data block in either type of data, image or text. Once a first group of data
blocks has been compressed and recorded as a group of compressed data blocks, the compression ratio may
be recorded in the Fig. 8A illustrated file directory as a reference for subsequent compression and storage of
data blocks. The Fig. 10 illustration assumes that the image data has been compressed 75% (compressed
image data blocks are 25% of original size) and text data blocks have been compressed about 50%. These
measured values may be changed for calculation purposes for adding a margin of error accommodation into the
calculations.
Step 195 determines whether the data in the file is text or image. If image, step 198 calculates the MIN
value as 4*SB (bytes in a sector), i.e. at least four sectors are to be used for storing a group of compressed
data blocks. The number four is selected in an arbitrary manner. Sector size affects the minimum number of
sectors to be used. Step 197 calculates MAX as being 64*SB. In an FBA disk having 1024 byte sectors,
the maximum DTU size is then 64 KB (kilobytes). Again, system considerations may change these values. Such
considerations are beyond the present description. From step 195, for text data (IMAGE DATA = NO), step 200
calculates MIN as 2*SB while step 201 calculates MAX as 32*SB. The number of uncompressed bytes for
image data in MIN and MAX is equal to the number of uncompressed bytes for text data. The different
compression ratios change MIN and MAX values inversely to the expected compression ratio. Upon completing either
calculation, host processor 11 stores the MIN and MAX values in the first entry 160 (Fig. 8A) of the appropriate
file directory and then exits the calculation.
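
The Fig. 10 selection can be expressed as a small Python function. The multipliers 4, 64, 2 and 32 come from the text above; the function name `select_min_max` and its parameters are assumptions for illustration, and a real system may use different values, as the text notes.

```python
from typing import Tuple


def select_min_max(is_image: bool, sector_bytes: int) -> Tuple[int, int]:
    """Return (MIN, MAX) in bytes for image or text data."""
    if is_image:
        return 4 * sector_bytes, 64 * sector_bytes   # image data branch
    return 2 * sector_bytes, 32 * sector_bytes       # steps 200 and 201


print(select_min_max(True, 1024))   # (4096, 65536): MAX is 64 KB
print(select_min_max(False, 1024))  # (2048, 32768)
```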

The MIN and MAX values may also be predetermined and included as parameter data defining a class of
data as set forth in Gelb et al US patent number 5,018,060 titled "ALLOCATING DATA STORAGE SPACE OF
PERIPHERAL DATA STORAGE DEVICES USING IMPLIED ALLOCATION BASED ON USER
PARAMETERS". Gelb et al teach that data set parameters implicitly control peripheral data storage operations. Such
implicit control based on data base or file parameter data may be applied to practicing the present invention.

Fig. 11 shows execution of a WRITE command by data storage system 12 wherein the data blocks received
in one DTU are compressed then recorded as a group of compressed data blocks. Step 210 receives a WRITE
command 130. Step 211 sets the link commanded in field 135 for reporting the actual number of sectors used
to store the resultant group of compressed data blocks and a compression ratio CR achieved. Step 212 sets
a compress mode in data storage system 12 for activating CD 101 to compress the data blocks being received
into one continuum of compressed data. Step 213 receives, compresses and stores the DTU data blocks. Step
216 compares the number of sectors actually used to store the compressed data with the number of sectors
initially allocated. Step 217 compares the byte count of the original data blocks in the received DTU with the
byte count of the compressed data blocks. In most instances, the byte count of the compressed data blocks
will be less than the byte count of the original DTU data blocks. In this instance, at step 218, data storage
system 12 indicates to host processor 11 that the data storage operation has been completed. The identification
of any unused sectors plus other information describing the just-completed data recording operation is to be
transferred from data storage system 12 to host processor 11. This transfer is effected by host processor 11
responding to the indication of a completed recording operation by issuing a READ BUFFER command 140
to data storage system 12 to send the number of unused allocated sectors and all other compression
information to host processor 11. Host processor 11 in step 219 responds to the indication of unused allocated
sectors to deallocate such sectors for use in storing other data. Note that if the compress bit 134 is off, then no
compression occurs.

If at step 217 it is determined that the data compression resulted in more data bytes in the compressed data
blocks than were in the original data blocks, then the data blocks will be recorded without data compression.
This growth in size of the compressed data blocks may occur when the original data blocks have certain data
patterns. In any event, at step 220, data storage system 12 sends a channel command retry (CCR) or its
equivalent to host processor 11. CCR indicates that the DTU has to be retransmitted by host processor 11 to data
storage system 12. That is, the increase in size of the DTU after compression is considered an error condition.
The CCR indicates that a recording error has occurred. Host processor 11 responds to the CCR at step 221
by resending the DTU to data storage system 12. At step 222, data storage system 12 stores the DTU without
data compression. The above-described operations are exited from either step 219 or 222.
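
The outcome of the Fig. 11 write sequence for one DTU can be modelled in a few lines. This is a rough sketch under stated assumptions: `zlib` merely stands in for the compressor CD 101 (the description does not name the algorithm here), and the function name `write_dtu` and the keys of the returned dictionary are hypothetical. The retransmission triggered by the CCR is reduced to a flag.

```python
import math
import zlib


def write_dtu(dtu, sector_bytes, allocated_sectors, compress_bit=True):
    """Model the result of writing one DTU (hypothetical helper)."""
    if not dtu:
        raise ValueError("DTU must contain at least one byte")
    # zlib is an assumed stand-in for the compressor CD 101.
    payload = zlib.compress(dtu) if compress_bit else dtu
    compressed = compress_bit
    retry = False
    if compress_bit and len(payload) > len(dtu):
        # Steps 220-222: growth is treated as an error, the DTU is re-sent
        # and then stored without compression.
        payload, compressed, retry = dtu, False, True
    sectors_used = math.ceil(len(payload) / sector_bytes)
    if sectors_used > allocated_sectors:
        raise ValueError("not enough sectors allocated")
    return {
        "payload": payload,
        "compressed": compressed,
        "retry": retry,
        "sectors_used": sectors_used,
        # Reported to the host so it can deallocate them (step 219).
        "unused_sectors": allocated_sectors - sectors_used,
        "compression_ratio": len(dtu) / len(payload),
    }
```

A highly repetitive 8192-byte DTU written into eight allocated 1024-byte sectors leaves most of them unused, while a one-byte DTU grows under `zlib` and therefore takes the uncompressed path.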

Fig. 12 is a flow chart showing system operations for reading data. Host processor 11 in step 225 prepares
to read data, i.e. identifies the data blocks to be read. Host processor 11 then in step 226 searches for a file
directory (Fig. 8A). Such file directory may be read from disk 30. If there is no file directory relating to
compression, then the data are not compressed. Also, if the field 166 of the Fig. 8 illustrated directory for the
identified group is zero, then that group is not compressed. Further, if data to be read are compressed and it is
desired to decompress in a unit other than the storing data storage system 12, step 226 directs host processor
operations to read all identified data without decompression via path 227. From path 227, a usual data recording
operation not involving data compression is performed (not shown). Host processor 11 builds and issues one READ
command 145 for each of the recorded groups of compressed data blocks to be read. Depending on the desired
read operation, field 150 of the READ command will be set to indicate either decompress ON or decompress OFF.
Host processor 11 before sending the READ command 145 to data storage system 12 examines field 150 at
step 226. If host processor 11 at step 226 finds that the data to be read are compressed and decompression
is desired, then step 230 sets field 150 to compress ON. All of the groups of compressed data blocks having
data blocks to be read are identified in step 231 via examination of the appropriate file directory 161-163. Host
processor 11 in step 232 then builds one or more READ commands 145 for reading the step 231 identified
groups of compressed data blocks with decompression. The term build used above indicates that the
appropriate control data are inserted into a READ command for commanding data storage system 12 to perform a
desired read. Such command includes the number of LBA addressed sectors to be read as well as the logical
address in LBA of a first one of the sectors. One READ command is sent by host processor 11 to data storage
system 12 in step 232; there can be a number of READ commands sent for fetching a plurality of groups of
record blocks. Data storage system 12 receives the READ command. At step 233, data storage system 12 checks
the sector compress bit of the first sector storing the requested group to be read. If bit C 177 (Fig. 8B) is unity,
then the data are compressed. Data storage system 12 then in step 234 reads the requested group including
decompressing the data. It is to be noted that if the READ command field 150 indicates decompression is
OFF, then no decompression occurs even if bit C 177 is set to unity. On the other hand, if bit C 177 equals
zero (data in the sector are not compressed), then at step 235 data storage system 12 reads and sends the
read data without decompression to host processor 11. The Fig. 2 illustrated system exits the read operation
for one group from either step 234 or 235.
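
The host-side "build" of READ commands in steps 230-232 might look like the sketch below. The record layouts are assumptions made purely for illustration: `GroupEntry` is a hypothetical stand-in for a file directory entry and `ReadCommand` for READ command 145, with `decompress` modelling field 150. Neither reflects the actual field layout of Figs. 8A and 8B.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GroupEntry:
    """Hypothetical directory entry for one recorded group."""
    first_lba: int      # logical address of the first sector of the group
    sector_count: int   # number of sectors holding the group
    compressed: bool    # whether the group was recorded compressed


@dataclass
class ReadCommand:
    """Hypothetical model of READ command 145."""
    first_lba: int
    sector_count: int
    decompress: bool    # models field 150


def build_read_commands(groups: List[GroupEntry],
                        want_decompress: bool) -> List[ReadCommand]:
    # One READ command per recorded group; decompression is requested only
    # when the group is compressed and the host wants decompressed data.
    return [
        ReadCommand(g.first_lba, g.sector_count,
                    want_decompress and g.compressed)
        for g in groups
    ]
```

Passing `want_decompress=False` corresponds to the case where decompression is to be done in a unit other than the storing data storage system.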

Fig. 13 illustrates operation of data storage system responding to a READ command 145. Step 236 receives
the READ command. Step 237 checks the compress field 150. If the compress field indicates that decompress
is ON, then C bit 177 of the sector being accessed is checked to ensure that the data to be read is in fact
recorded and stored in a compressed form. Step 238 executes the READ command by decompressing the
data being read if field 150 indicates compression and C bit 177 is ON. If the field 150 indicates decompression
is OFF, the data stored in the addressed sectors are transferred without decompression whether compressed
or not. That is, in all cases, data storage system 12 transfers the data without decompression if field 150
indicates compress is OFF. This control enables transferring data in either compressed or decompressed form.
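
The device-side decision just described reduces to a two-input rule: decompress only when field 150 requests it and the sector's C bit is set. A minimal sketch, assuming `zlib` as a stand-in for the hardware decompressor and using the hypothetical name `execute_read`:

```python
import zlib


def execute_read(stored, c_bit, field_150_on):
    """Return the bytes sent to the host for one recorded group."""
    if field_150_on and c_bit:
        # Decompression requested and the data really are compressed.
        return zlib.decompress(stored)
    # Field 150 OFF, or data never compressed: transfer as stored.
    return stored
```
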

Fig. 14 illustrates one application of the invention in a system having linked host processors. Both batch
and in-line data compression/decompression are employed. Compression-decompression software modules
251 and 273 provide batch data compression and decompression while integrated circuit chips (hardware
compress-decompress) 253 and 272 provide in-line (real time) data compression-decompression. Two data
processing systems 240 and 241 are linked by data link 263. Link 263 may be a local area network (LAN), a data
communication circuit or transfer of a removable data cartridge manually or via a library, mail etc. between the
two data processing systems. Host processor 250 in system 240 has a software compress-decompress facility
251, a transfer link facility 252 that involves no compression or decompression and an in-line hardware
compress-decompress facility 253. Facilities 251-253 may be physically located in data processing system 240 in
host processor 250 or as a part of a channel connection that includes logic switch 254 (programmed or
hardware) connecting host processor 250 to facilities 251-253. Dashed line 255 indicates that switch 254 is
programmingly controlled by host processor 250. A given data processing system may have only 1) batch compress
facility 251 and link facility 252, 2) in-line facility 253 and link facility 252, 3) all facilities 251-253 or 4) either
facility 251 or 253 may be located either in data storage system 262 or data link 263.

The input-output (IO) connections from facilities 251-253 are effected by logic switch 260 that is
programmingly controlled by host processor 250 as indicated by dashed line 261. Switch 260 directs IO data flow
between facilities 251-253 and a data storage system 262 or data link 263.

Data processing system 241 is shown as being identical to data processing system 240. Data processing
system 241 includes host processor 270 that may have a different computational arrangement and capability
from host processor 250, logic switch 271, facilities 272-274, data storage system 275 and switch 277 that
selectively connects data processing system 241 to data link 263 to other systems and data processing system
240.

Fig. 15 illustrates updating a recorded group of compressed data blocks. Host processor 11 in step 280
has updated data blocks and desires to update a file recorded in data storage system 12 as a plurality of groups
of compressed data blocks. Step 281 compares the data length (number of uncompressed data bytes) of the
updating DTU with the number of bytes in sectors currently recorded as one group to be updated. Host
processor 11 also examines the number of padding bytes in a last sector storing compressed data for estimating
whether or not the updated data blocks are storable in the currently allocated sectors for the group(s) to be
updated.

At step 282 host processor 11 determines whether or not the updating DTU can be stored in currently
allocated sectors or if more or different sectors should be allocated. That is, if the updating DTU has more data
bytes than the currently recorded group, then additional sectors are allocated at step 288 (host processor 11
does the allocation). Such new sectors are preferably contiguous sectors that may not include any sectors
containing the recorded group of data blocks to be updated. Following allocation step 288, the updating DTU is
recorded at step 289. Then, host processor 11 at step 290 deallocates the sectors containing the group of data
blocks to be updated. The Fig. 2 illustrated system then exits the updating operation from step 290.

If, at step 282, the number of data bytes in the updating DTU is substantially equal to the number of bytes
(uncompressed) of the recorded DTU, then the updating occurs at step 283 using the sectors currently storing
the group to be updated. The Fig. 2 illustrated system then performs step 290 before exiting the updating
operation. If the updating DTU has fewer bytes than the recorded group, then the updating DTU is recorded in
sectors selected from the sectors containing the group to be updated. The sectors not used to record the
updating DTU are deallocated at step 290.

It may be decided, independently of any data growth patterns, to always store the updated data blocks
in a newly allocated set of sectors and to deallocate or free the sectors storing the current group(s) of
compressed data blocks to be updated. In this situation, steps 288-290 are performed. For example, if there is a
desire to save the original group(s) of compressed data blocks, such original recording may be retained. Host
processor 11 then updates the appropriate file directory 160-162 and exits the storage operation.

In the updating operation shown in Fig. 15, whenever the compressed data has more bytes than the
original uncompressed data, the data are recorded in an uncompressed form. The steps shown in Fig. 11 are added
to the Fig. 15 illustrated sequence.
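
The step 282 decision in the Fig. 15 update sequence can be summarized in a short sketch. Everything here is illustrative: the function name, the returned labels and the `tolerance` parameter are assumptions. In particular, `tolerance` is just one possible reading of "substantially equal", which the description does not quantify.

```python
def plan_update(new_bytes, recorded_bytes, tolerance=0, always_relocate=False):
    """Choose where an updating DTU is recorded (hypothetical helper).

    Byte counts are uncompressed lengths, as compared in step 281.
    """
    if always_relocate:
        # Policy of always writing to newly allocated sectors (steps 288-290).
        return "allocate_new"
    if abs(new_bytes - recorded_bytes) <= tolerance:
        # Substantially equal: reuse the currently allocated sectors (step 283).
        return "in_place"
    if new_bytes > recorded_bytes:
        # Larger update: allocate, record, then free the old sectors
        # (steps 288-290).
        return "allocate_new"
    # Smaller update: record in a subset of the current sectors and
    # deallocate the remainder (step 290).
    return "reuse_subset"
```
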


Claims 

1. Apparatus for storing data in compressed form in a data storage device having a plurality of addressable
like-sized data storage areas, each for recording a predetermined number of data bytes, the data storage
device being connected to means for receiving data to be recorded, said received data being arranged in
a plurality of addressable data blocks, characterised in that the apparatus comprises, in combination:

selection means in the means for receiving data for selecting one or more data transfer units of
data blocks to be recorded, each said transfer unit of data blocks having a given number of data bytes
and including one or more of said addressable data blocks;

compression means connected to the selection means for compressing said transfer unit of data
blocks to be recorded as a group of compressed data blocks;

data access means in the device connected to said compression means for recording said group
of compressed data blocks in said addressable data storage areas as one continuum of compressed data;
and

directory means indicating which ones of said addressable data storage areas said continuum of
data is recorded in, and indicating that said continuum of data contains said selected transfer unit of data
blocks in a compressed form.

2. Apparatus according to claim 1 including:

allocation means connected to the selection means for responding to the number of data bytes in
each said transfer unit of data blocks to indicate that said transfer unit requires a first number of
addressable data storage areas to record said transfer unit of data blocks; and

recording means for recording said transfer unit as a group of compressed data blocks in a second
number of said addressable data storage areas, said second number being equal to or less than said first
number.

3. Apparatus according to claim 2 including update means connected to said selection means and to said
allocation means for updating a recorded group of compressed data blocks with updated data blocks,
including receiving updated ones of said data blocks and selecting a data transfer unit of data blocks to
include said updated data blocks; said update means being connected to said allocation means for allocating
a number of said addressable data storage areas for receiving and recording said updated compressed
data blocks and for deallocating ones of said allocated addressable data storage areas in which are stored
the original group of compressed data blocks to be updated.

4. Apparatus according to any one of claims 1 to 3 wherein the selection means is connected to the data
storage device for selecting said given number of data bytes to form a transfer unit in dependence on the
number of data bytes which are recordable in each of said data storage areas.

5. Apparatus according to any one of claims 1 to 4, including range means indicating a range of number of
bytes to be used for transferring data blocks between said means for receiving data and said data storage
device, wherein said selection means is connected to said range means for receiving said range indication
and responding to the received range indication for selecting a predetermined number of said data blocks
to be a data transfer unit and to be in one of said groups of data blocks such that each said group of data
blocks has a number of data bytes equivalent to a plurality of uncompressed data blocks.

6. Apparatus according to any one of claims 1 to 5 wherein the apparatus includes:

CKD means for supplying a plurality of CKD data blocks;

a CKD formatted disk within the data storage device for receiving and recording CKD data;

said selection means being connected to said CKD means for receiving and selecting a
predetermined number of said CKD data blocks as a data transfer unit of said CKD data blocks;

said data access means having CKD recording means for recording said transfer unit of CKD data
as compressed by said compression means as a single record on said CKD formatted disk; and

repeat means connected to said selection means and to said CKD means for repeatedly actuating
the CKD means to supply a transfer unit of CKD data blocks for compression and recording in respective
single CKD records.

7. Apparatus according to any one of claims 1 to 5, including:

a host processor connected to a peripheral controller, said data storage device being connected to
said peripheral controller;

an FBA sectored disk in said data storage device having a plurality of addressable sectors for
receiving and recording data blocks;

said selection means having means for selecting said data blocks for said data transfer unit to be
recorded in a predetermined number of sectors on said FBA sectored disk; and

repeat means connected to said selection means for repeatedly actuating the selection means for
selecting a plurality of said transfer units of data blocks from one file of such data blocks for compression
and recording of said transfer units of data blocks such that said file of data blocks is recorded in
compressed form on said FBA sectored disk in a plurality of said continuum of data wherein each said continuum
consists of one said group of compressed data blocks.

8. Apparatus according to any one of the preceding claims, including data recording management means
connected to said directory means and to said data access means for actuating the directory means to
establish a plurality of said file directories, one file directory for each file of data recorded in compressed
form; said recording management means actuating said directory means to record in each of said file
directories a number of said data blocks to be included in each of said data transfer units of data and
including recording a maximum number of bytes to be included in any one of said data transfer units.

9. A method of compressing and recording onto a data storage medium data of a file which is arranged in
a plurality of addressable data blocks, the method comprising the steps of:

selecting a plurality of said data blocks of said file to be compressed and recorded;

segmenting the selected plurality of addressable data blocks into one or more data transfer units;

compressing each of said one or more data transfer units and recording them as respective
separate groups of compressed data blocks; and

creating and maintaining a file directory indicating the address and size of each of said recorded
groups for enabling random access to recorded data within said file of data blocks.

10. A method according to claim 9 including the steps of:

before recording one of said groups of compressed data blocks, allocating a first number of
addressable data storage areas of the storage medium for recording said one group of compressed data
blocks; and

after recording said one group of compressed data blocks, deallocating allocated addressable data
storage areas, if any, into which said one group of compressed data blocks was not recorded.
11. A method according to claim 9 or claim 10, including:

supplying CKD formatted data blocks of one CKD formatted file and selecting said CKD data blocks
to be compressed and recorded;

compressing one or more data transfer units of said CKD data blocks into one or more groups,
respectively, of compressed CKD data blocks; and

recording the one or more groups of compressed CKD data blocks as one record on a CKD
formatted record member.

12. A method according to claim 9 or claim 10, including:

selecting an FBA formatted record medium to be said record medium, said FBA formatted record
medium having a plurality of addressable data-storing sectors, each data-storing sector being capable of
recording a given number of data bytes; and

selecting said data transfer unit to have a first predetermined number of said data blocks having
a number of uncompressed data bytes equal to a data storage capacity, in data bytes, of a second
predetermined number of said data-storing sectors.

13. A method according to any one of claims 9 to 12, including:

setting a range of number of bytes to be included in each of said data transfer units; and

selecting a number of said data blocks such that a number of data bytes in the selected number
of data blocks is within the set range.